Combination of 3 Types of Speech Recognizers for Anaphora Resolution
Abstract
In this paper, we propose a method for anaphora resolution in speech understanding for a livelihood support robot. For robust speech recognition, we combine two types of speech recognizers: a large vocabulary continuous speech recognizer (LVCSR) and domain-specific speech recognizers (DSSRs). One problem in anaphora resolution is that the antecedent may be missing from the recognizer outputs. To solve this problem, we introduce two types of DSSRs: one medium-scale DSSR and several small-scale DSSRs. We first describe the basic idea of our multiple speech recognizer, in which the selection process is based on the similarity between the LVCSR and each DSSR. Then, using the outputs from the LVCSR and the medium-scale DSSR, we resolve anaphoric expressions in the current output from a small-scale DSSR. The experimental result shows the effectiveness of our method.
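The similarity-based selection step mentioned in the abstract could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not specify the similarity measure, so the sketch compares recognizer output strings with a word-level matching ratio from Python's `difflib`. The names `similarity` and `select_output`, the `threshold` parameter, and the fallback to the LVCSR output are all hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, Tuple


def similarity(a: str, b: str) -> float:
    """Word-level similarity in [0, 1] between two hypotheses.

    Assumption: the source does not define its similarity measure;
    difflib's matching ratio is used here as a stand-in.
    """
    return SequenceMatcher(None, a.split(), b.split()).ratio()


def select_output(
    lvcsr_hyp: str,
    dssr_hyps: Dict[str, str],
    threshold: float = 0.5,  # hypothetical cut-off, not taken from the source
) -> Tuple[str, str, float]:
    """Return (recognizer name, hypothesis, score) for the chosen output.

    The DSSR whose hypothesis is most similar to the LVCSR hypothesis is
    selected. If there is no DSSR hypothesis, or the best score is below
    the threshold, the LVCSR hypothesis is returned instead (an assumed
    fallback policy).
    """
    if not dssr_hyps:
        return "lvcsr", lvcsr_hyp, 0.0
    best_name, best_score = max(
        ((name, similarity(lvcsr_hyp, hyp)) for name, hyp in dssr_hyps.items()),
        key=lambda item: item[1],
    )
    if best_score < threshold:
        return "lvcsr", lvcsr_hyp, best_score
    return best_name, dssr_hyps[best_name], best_score


if __name__ == "__main__":
    # Hypothetical recognizer outputs, for illustration only.
    chosen = select_output(
        "please bring me the remote control",
        {"fetch": "bring me the remote control", "tv": "turn on the television"},
    )
    print(chosen)
```

A word-level comparison is used here because recognizer outputs are word sequences. A character-level or phoneme-level measure would plug into `similarity` in the same way.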
Similar resources
Using Articulatory Knowledge in Automatic Speech Recognition
Over the years, different types of speech recognizers have been proposed and tested. During the last decade (or maybe even longer), hidden Markov models (HMMs) seem to have performed better than other types of speech recognizers, such as rule-based speech recognizers. This state of affairs has led to a gap between speech technology on the one hand, and phonetics and phonology on the other. ...
Towards Symmetric Multimodality: Fusion and Fission of Speech, Gesture, and Facial Expression
We introduce the notion of symmetric multimodality for dialogue systems in which all input modes (e.g., speech, gesture, facial expression) are also available for output, and vice versa. A dialogue system with symmetric multimodality must not only understand and represent the user's multimodal input, but also its own multimodal output. We present the SmartKom system, which provides full symmetric ...
SmartKom: Symmetric Multimodality in an Adaptive and Reusable Dialogue Shell
We introduce the notion of symmetric multimodality for dialogue systems in which all input modes (e.g., speech, gesture, facial expression) are also available for output, and vice versa. A dialogue system with symmetric multimodality must not only understand and represent the user's multimodal input, but also its own multimodal output. We present the SmartKom system, which provides full symmetric ...
Dialogue Systems Go Multimodal: The SmartKom Experience
Multimodal dialogue systems exploit one of the major characteristics of human-human interaction: the coordinated use of different modalities. Allowing all of the modalities to refer to and depend upon each other is a key to the richness of multimodal communication. We introduce the notion of symmetric multimodality for dialogue systems in which all input modes (e.g., speech, gesture, facial expr...
Anaphora for Everyone: Pronominal Anaphora Resolution without a Parser
We present an algorithm for anaphora resolution which is a modified and extended version of that developed by Lappin and Leass (1994). In contrast to that work, our algorithm does not require in-depth, full syntactic parsing of text. Instead, with minimal compromise in output quality, the modifications enable the resolution process to work from the output of a part-of-speech tagger, enriched ...
Publication date: 2010